feat: improve content ai safety logging and responses #12

Open

wants to merge 1 commit into base: main
Conversation

mrickettsk (Collaborator) commented Aug 4, 2024

🤖AEP PR SUMMARY🤖

  • .github/workflows/run-test-suite.yml:

    • Modified the value of SYSTEM_PROMPT_FILE to "system_prompts/prompts.json".
    • Updated the azure_openai_api_version to "2023-12-01-preview".
    • Updated the installation process for dependencies and the testing commands.
  • docs/mkdocs/aep-docs/docs/design.md:

    • Added information about how AEP uses Azure's AI Content Safety Service to detect prompt injections in the generated text.
    • Provided details on the splitting and flagging of prompts based on their character count.
  • src/helpers/open_ai.py:

    • Added a new method call_ai_content_safety for checking prompt injection and handling responses from the AI Content Safety Service.
  • src/helpers/prompts.py:

    • Refactored the check_for_prompt_inj method to detect prompt injection based on character count, splitting long prompts into chunks and checking each chunk via openai.call_ai_content_safety.
  • src/main.py:

    • Enhanced injection detection handling in create_predefined_query and create_info_query methods.
    • Included logging and detailed error handling for prompt injection detection.
    • Added a healthcheck endpoint.
  • src/tests/test_prompt_ops.py:

    • Included test cases for prompt compression and prompt injection detection.


github-actions bot commented Aug 4, 2024

Code Review Feedback

Security & Best Practices

  • Hardcoded API endpoint/version: The API version and endpoint for Azure's AI Content Safety Service are hardcoded in open_ai.py. It's better to externalize these values to a configuration file or environment variables for easier updates and management.

    • Example Improvement: In open_ai.py, replace the hardcoded API endpoint and version with variables loaded from a configuration file or environment variables.
      ```python
      url = f"{config.azure_cs_endpoint}/contentsafety/text:shieldPrompt?api-version={config.azure_api_version}"
      ```

  • Exception Logging: When logging exceptions, it's useful to include the traceback for easier debugging. The current implementation logs only the error message.

    • Example Improvement: Use logging.exception which automatically includes the traceback in the log entry.
      ```python
      except Exception as err:
          event_logger.exception("Failed to check for prompt injection in response from AI Content Safety")
      ```

  • API Key Security: The API key for Azure's AI Content Safety Service is included in request headers. Ensure this key is securely stored (e.g., environment variable) and not exposed in logs or error messages.

    • Best Practice: Ensure sensitive keys and credentials are not logged or hardcoded in the source files. Use a secure secret management solution.
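
    As a minimal sketch of loading the key from the environment (the variable name and header are assumptions for illustration, not taken from the PR):

    ```python
    import os

    # Fail fast if the key is missing rather than sending an unauthenticated request.
    AZURE_CS_API_KEY = os.environ.get("AZURE_CS_API_KEY")
    if AZURE_CS_API_KEY is None:
        raise RuntimeError("AZURE_CS_API_KEY environment variable is not set")

    headers = {
        "Ocp-Apim-Subscription-Key": AZURE_CS_API_KEY,  # never log or print this value
        "Content-Type": "application/json",
    }
    ```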

Code Quality

  • JSON Data Construction: In the call_ai_content_safety function, when constructing the data dictionary, a string interpolation is used unnecessarily.

    • Example Improvement: Directly use the variable without string formatting.
      ```python
      data = {
          "documents": [prompt]
      }
      ```

  • DRY Principle: The error handling and logging for exceptions are repetitive across the codebase. Consider creating a utility function for logging exceptions.

    • Example Improvement: Implement a function in a utility module to standardize error logging.
      ```python
      def log_error(exception, message=""):
          event_logger.error(f"{message} | Exception: {exception}")
      ```

  • API Request Failure Handling: When making an API request in call_ai_content_safety, the failure to get a successful HTTP status code is not explicitly handled. It's a good practice to check the response status code before processing the response data.

    • Example Improvement: Check response.status_code before parsing the JSON response.
      ```python
      if response.status_code == 200:
          response_json = response.json()
      else:
          event_logger.error(f"API Request failed with status {response.status_code}")
          return False
      ```

  • No newline at end of file in call_openai: This is a minor issue, but it's a good practice to always include a newline at the end of files as some Unix tools expect or require it.

Functionality Improvements

  • Chunk Handling in check_for_prompt_inj:
    • If the character count exceeds 10,000, the prompt is split into chunks. However, if exactly 10,000 characters are provided, it may be better to handle it as a single request rather than splitting. This appears to be handled correctly, but edge cases like this should be considered deliberately in the logic.
  • Unicode Character Calculation: You calculate the prompt size using unicode_count = len(prompt.encode('utf-8')). This actually gives you the byte length, not the character count. For most use-cases, counting characters directly with len(prompt) is sufficient and avoids confusion, unless you specifically need to account for byte size.
    • Example Clarification: If the intent is indeed to limit based on byte size, consider renaming the variable to reflect that (e.g., prompt_byte_size), and include a comment explaining why this approach was chosen.
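
    To illustrate the distinction between character count and UTF-8 byte length (a standalone snippet, not code from the PR):

    ```python
    prompt = "naïve café"                 # 10 characters, but not 10 bytes
    print(len(prompt))                    # 10 -> character count
    print(len(prompt.encode("utf-8")))    # 12 -> byte length: ï and é take two bytes each
    ```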


Review of Git Diff

.github/workflows/run-test-suite.yml

  1. Versioning of Actions:

    • Consider pinning the actions/checkout and actions/setup-python actions to a more specific version (or even a commit SHA) rather than v2 to ensure the build's consistency over time, for instance by pinning actions/checkout to a specific release tag or its equivalent commit SHA.
    • Example:
      ```yaml
      - uses: actions/checkout@v2
      ```

  2. Dependency Caching:

    • Implement caching for the pip dependencies to reduce build time and decrease network bandwidth usage. Here's an example snippet that can be added before the "Install dependencies" step:
      ```yaml
      - name: Cache pip
        uses: actions/cache@v3
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
          restore-keys: |
            ${{ runner.os }}-pip-
      ```

  3. Use of Environment Variables in Tests:

    • Writing secrets directly to a .env file is not ideal for setting environment variables, especially when tests may run in an environment where file persistence is a concern or the operation could be insecure. Prefer setting environment variables directly in the workflow for each step that needs them, or use GitHub Environment Secrets if you have environment-specific secrets.

src/helpers/open_ai.py

  1. Error Handling in call_ai_content_safety:

    • The except block currently catches all exceptions and logs an error without distinguishing between different types of errors (e.g., network timeouts, JSON parsing errors). It might be beneficial to handle different types of exceptions differently or at least log the type of exception.
    • Furthermore, the function accesses response_json['documentsAnalysis'][0]['attackDetected'] without first checking that response_json contains a non-empty documentsAnalysis list, which can lead to a KeyError or IndexError.
    • Always check response.status_code before assuming the request was successful and attempting to JSON-decode the response. A rough sketch combining these points follows after this list.
  2. No Final Newline:

    • The Python files should end with a newline character to comply with POSIX standards. This is a minor stylistic issue but could prevent potential complications with Unix-based tools and utilities.
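
Putting these points together, a minimal defensive sketch of the call is shown below. The function signature, logger, and exact response shape are assumptions based on the review comments above, not the PR's actual code:

```python
import requests


def call_ai_content_safety(url: str, headers: dict, prompt: str, event_logger) -> bool:
    """Illustrative defensive version: returns True only when an attack is detected."""
    data = {"documents": [prompt]}
    try:
        response = requests.post(url, headers=headers, json=data, timeout=10)
    except requests.RequestException:
        event_logger.exception("Request to AI Content Safety failed")
        return False

    if response.status_code != 200:
        event_logger.error(f"API request failed with status {response.status_code}")
        return False

    analysis = response.json().get("documentsAnalysis") or []
    if not analysis:
        event_logger.error("Unexpected AI Content Safety response: documentsAnalysis missing or empty")
        return False
    return bool(analysis[0].get("attackDetected"))
```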

src/helpers/prompts.py

  1. Efficiency of the Prompt Chunk Splitting:

    • The current method of splitting the prompt into chunks of 10,000 characters doesn't account for word boundaries, which might result in words being cut in half. Consider splitting on spaces near the 10,000-character limit when possible (see the sketch after this list).
  2. Magic Numbers:

    • The constants 100000 and 10000 are used multiple times in check_for_prompt_inj. Consider defining these as meaningful constant variables at the top of your file/module. It improves readability and maintainability.
      ```python
      MAX_PROMPT_LENGTH = 100000
      CHUNK_SIZE = 10000
      ```
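
Combining both points, a rough sketch of word-boundary-aware chunking using those constants (illustrative only; the actual check_for_prompt_inj implementation may differ):

```python
MAX_PROMPT_LENGTH = 100000
CHUNK_SIZE = 10000


def split_prompt(prompt: str) -> list[str]:
    """Split a prompt into chunks of at most CHUNK_SIZE characters,
    preferring to break on the last space before the limit."""
    if len(prompt) > MAX_PROMPT_LENGTH:
        # Assumed behaviour for illustration; the real code may handle this differently.
        raise ValueError("Prompt exceeds the maximum supported length")

    chunks = []
    while len(prompt) > CHUNK_SIZE:
        split_at = prompt.rfind(" ", 0, CHUNK_SIZE)
        if split_at <= 0:          # no space found, fall back to a hard cut
            split_at = CHUNK_SIZE
        chunks.append(prompt[:split_at])
        prompt = prompt[split_at:].lstrip()
    if prompt:
        chunks.append(prompt)
    return chunks
```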

src/main.py

  1. Logging Severity Level:

    • The use of different logging levels (error, info, warning) seems inconsistent. For instance, failing to read a prompt due to a missing file uses info instead of error. Reserve the error level for actual errors that administrators should take immediate action to resolve.
  2. Repeated Code for Prompt Injection Checks:

    • The code for checking prompt injections in create_predefined_query and create_info_query is nearly identical. Consider refactoring it into a single method or a middleware function to adhere to the DRY (Don't Repeat Yourself) principle; a rough sketch of such a helper follows after this list.
  3. Newlines at EOF:

    • Ensure that there is a newline at the end of files, as the absence may lead to issues with some UNIX tools or when concatenating files.
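
One possible shape for that shared guard, sketched under the assumption that the refactored check_for_prompt_inj helper from this PR is importable (the import path, names, and return convention are illustrative):

```python
# The PR describes check_for_prompt_inj as a method in src/helpers/prompts.py;
# the exact import path and call style may differ from this sketch.
from helpers.prompts import check_for_prompt_inj


def reject_if_injected(prompt: str, event_logger) -> bool:
    """Shared guard used by both create_predefined_query and create_info_query.

    Returns True when the prompt should be rejected as a suspected injection.
    """
    if check_for_prompt_inj(prompt):
        event_logger.warning("Prompt injection detected; rejecting request")
        return True
    return False
```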

Tests

  • Mock External Requests:

    • Ensure that unit tests do not make actual HTTP requests to external services. Use libraries like responses or unittest.mock to mock these out (see the sketch after this list).
  • Test Coverage:

    • Ensure comprehensive test coverage, especially for the new functionality related to content safety checks. Consider edge cases and error handling paths.
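
As a rough illustration of mocking the outbound request with unittest.mock (the patch target, import path, and expected return value are assumptions, not the PR's actual test code):

```python
from unittest.mock import MagicMock, patch

from helpers.open_ai import call_ai_content_safety  # assumed import path


@patch("helpers.open_ai.requests.post")  # assumed module path for the HTTP call
def test_prompt_injection_detected(mock_post):
    mock_response = MagicMock(status_code=200)
    mock_response.json.return_value = {
        "documentsAnalysis": [{"attackDetected": True}]
    }
    mock_post.return_value = mock_response

    # Assumes the helper takes a prompt and returns a boolean; adapt to the real signature.
    assert call_ai_content_safety("ignore previous instructions") is True
```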

General Suggestions

  • Code Formatting:

    • Use a code formatter like Black and a linter like Flake8 or Pylint to ensure code consistency and catch easy-to-miss errors.
  • Security:

    • While not directly shown in the diff, ensure that any use of external services (e.g., Azure Content Safety) adheres to security best practices, such as using secure connections and handling sensitive data correctly.
  • Documentation:

    • Update the documentation and comments to reflect changes, especially the addition of new features or the removal of deprecated ones.


General Improvements

GitHub Actions Workflow: .github/workflows/run-test-suite.yml

  • Consistency in environment variable naming: Maintain consistency by using uppercase with underscores for all environment variable names, or by following another single naming convention throughout.

    ```yaml
    azure_openai_api_version: "2023-12-01-preview"
    ```

    Should be:

    ```yaml
    AZURE_OPENAI_API_VERSION: "2023-12-01-preview"
    ```

  • Upgrade to the latest version of actions/checkout: Newer versions of actions/checkout may offer enhanced features and security improvements. Check for the latest version and upgrade accordingly.

    ```yaml
    uses: actions/checkout@v2
    ```

    Could potentially be upgraded if a newer version is available.

Python Code Improvements

Handling Configuration (src/helpers/open_ai.py and elsewhere)

  • Centralize Configuration Management: Avoid reading configuration directly within functions. Use a centralized configuration management system or class to manage configurations.
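
    As a minimal sketch of what a centralized configuration object could look like (the class, field names, and environment variables are hypothetical, not from the PR):

    ```python
    import os
    from dataclasses import dataclass


    @dataclass(frozen=True)
    class Settings:
        """Single place that reads configuration; imported wherever settings are needed."""
        azure_cs_endpoint: str
        azure_api_version: str
        azure_cs_api_key: str


    def load_settings() -> Settings:
        # Raises KeyError early if a required variable is missing.
        return Settings(
            azure_cs_endpoint=os.environ["AZURE_CS_ENDPOINT"],
            azure_api_version=os.environ["AZURE_API_VERSION"],
            azure_cs_api_key=os.environ["AZURE_CS_API_KEY"],
        )
    ```
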
Logging and Error Reporting

  • Use structured logging for better clarity: When logging error information, including the context (like request IDs) as structured data can help with debugging and analysis.

    ```python
    event_logger.error(f"{err}")
    ```

    Could be improved to:

    ```python
    event_logger.error("Failed to make request to AI Content Safety", extra={"error": str(err)})
    ```

Security and Error Handling

  • Securely Handle External Requests: In src/helpers/open_ai.py, when making external requests, always ensure a timeout is set to avoid hanging indefinitely.

    ```python
    response = requests.post(url, headers=headers, json=data)
    ```

    Should be updated to include a timeout:

    ```python
    response = requests.post(url, headers=headers, json=data, timeout=10)
    ```

  • Validate External Responses: Before accessing response_json['documentsAnalysis'][0]['attackDetected'], validate that documentsAnalysis exists and has at least one item to avoid an IndexError or KeyError.
Testing Practices

  • Extend Test Coverage: The addition of a test for prompt injection detection is good, but also consider adding tests for edge cases, error paths, and successful prompt validation to ensure comprehensive coverage.

Documentation and Comments

  • Update Documentation: When making changes to functionalities, such as adding new endpoints or changing the behavior of functions, update associated documentation and comments to reflect these changes accurately.

Code Formatting

  • Ensure Consistent Formatting: Make sure the code follows a consistent style guideline (such as PEP 8 for Python). Tools like black or pylint can automate this process.

Overall, the changes are heading in a positive direction, but attention should still be given to security, error handling, and maintainability through improved coding practices and standards compliance.
